Exploring the Vector Space Model for Finding Verb Synonyms in Portuguese
Abstract
We explore the performance of the Vector Space Model (VSM) in finding verb synonyms in Portuguese by analyzing the impact of three operating parameters: (i) the weighting function, (ii) the context window used for automatically extracting features, and (iii) the minimum number of vector features. We rely on distributional statistics taken from a large n-gram database to build feature vectors, using minimal linguistic pre-processing. Automatic evaluation of synonym candidates using gold-standard information from the OpenOffice and Wiktionary thesauri shows that low-frequency features carry most of the information regarding verb similarity, and that a [0, +2] window carries more information than a [-2, 0] window. We show that satisfactory precision levels require vectors with 50 or more non-nil components. Manual evaluation over a set of declarative verbs and psychological verbs shows that VSM-based approaches achieve good precision in finding verb synonyms for Portuguese, even when using minimal linguistic knowledge. This leads us to propose a performance baseline for this task.
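The three parameters named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the toy n-gram counts, the function names (`build_vectors`, `ppmi`, `cosine`, `rank_candidates`), the choice of positive pointwise mutual information as the weighting function, and cosine as the similarity measure. The abstract does not say which weighting functions or similarity measure were used.

```python
from collections import Counter, defaultdict
from math import log, sqrt

# Toy n-gram database (tokens, count); invented for illustration only.
NGRAMS = [
    (("dizer", "que", "sim"), 10),
    (("afirmar", "que", "sim"), 8),
    (("comer", "o", "pão"), 5),
    (("ele", "vai", "dizer"), 3),
]
TARGETS = {"dizer", "afirmar", "comer"}


def build_vectors(ngrams, targets, window=(0, 2)):
    """Collect (offset, word) features around each target verb.

    window=(0, 2) takes the two tokens to the right of the verb,
    window=(-2, 0) the two tokens to its left.
    """
    lo, hi = window
    vectors = defaultdict(Counter)
    for tokens, count in ngrams:
        for i, token in enumerate(tokens):
            if token not in targets:
                continue
            for offset in range(lo, hi + 1):
                j = i + offset
                if offset != 0 and 0 <= j < len(tokens):
                    vectors[token][(offset, tokens[j])] += count
    return dict(vectors)


def ppmi(vectors):
    """Assumed weighting function: positive pointwise mutual information."""
    total = sum(sum(vec.values()) for vec in vectors.values())
    feature_totals = Counter()
    for vec in vectors.values():
        feature_totals.update(vec)
    weighted = {}
    for verb, vec in vectors.items():
        row_total = sum(vec.values())
        weights = {}
        for feature, count in vec.items():
            value = log(count * total / (row_total * feature_totals[feature]))
            if value > 0:  # keep only positive associations
                weights[feature] = value
        weighted[verb] = weights
    return weighted


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[f] for f, w in u.items() if f in v)
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def rank_candidates(verb, vectors, min_features=50):
    """Rank other verbs by similarity, skipping vectors that are too sparse."""
    usable = {v: vec for v, vec in vectors.items() if len(vec) >= min_features}
    if verb not in usable:
        return []
    scores = [(other, cosine(usable[verb], vec))
              for other, vec in usable.items() if other != verb]
    return sorted((s for s in scores if s[1] > 0), key=lambda s: s[1], reverse=True)


if __name__ == "__main__":
    right = build_vectors(NGRAMS, TARGETS, window=(0, 2))
    # min_features is lowered to 2 only because the toy data is tiny.
    print(rank_candidates("dizer", ppmi(right), min_features=2))
```

Changing `window` to `(-2, 0)` switches to left-context features, and `min_features` plays the role of the minimum number of non-nil components; the default of 50 mirrors the threshold reported in the abstract.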
Similar resources
Reachability checking in complex and concurrent software systems using intelligent search methods
Software system verification is an efficient technique for ensuring the correctness of a software product, especially in safety-critical systems in which a small bug may have disastrous consequences. The goal of software verification is to ensure that the product fulfills the requirements. Studies show that the cost of finding and fixing errors in design time is less than finding and fixing the...
Finding High-Frequent Synonyms of A Domain-Specific Verb in English Sub-Language of MEDLINE Abstracts Using WordNet
The task of binary relation extraction in IE [3] is based mainly on high-frequent verbs and patterns. During the extraction of a specific relation from MEDLINE English abstracts, it is noticed that besides the high-frequent verb itself which represents the specific relation, some other word forms, such as the nominal and adjective forms of this verb, as well as its synonyms, also play a very im...
Space Vector Control Scheme of Three Level ZSI Applied to Wind Energy Systems
In this paper the Space Vector Control Scheme is implemented for a Wind Energy System using Three Level Impedance Source Inverter (ZSI). The wind energy system uses a Self Excited Induction generator (SEIG) which is the most emerging application in the field of Wind Energy Conversion System (WECS). The proposed system is modelled with a generator-side Diode Bridge Rectifier and a Stand-Alone si...
A Hybrid Meta-heuristic Approach to Cope with State Space Explosion in Model Checking Technique for Deadlock Freeness
Model checking is an automatic technique for software verification through which all reachable states are generated from an initial state to find errors and desirable patterns. In the model checking approach, the behavior and structure of the system should be modeled. A graph transformation system is a graphical formal modeling language for specifying and modeling the system. However, modeling of large s...
Enriching a Portuguese WordNet using Synonyms from a Monolingual Dictionary
In this article we present an exploratory approach to enrich a WordNet-like lexical ontology with the synonyms present in a standard monolingual Portuguese dictionary. The dictionary was converted from PDF into XML and senses were automatically identified and annotated. This allowed us to extract them, independently of definitions, and to create sets of synonyms (synsets). These synsets were th...